Towards Modular Development of Typed Unification Grammars

نویسندگان

  • Yael Sygal
  • Shuly Wintner
چکیده

Development of large-scale grammars for natural languages is a complicated endeavor: Grammars are developed collaboratively by teams of linguists, computational linguists, and computer scientists, in a process very similar to the development of large-scale software. Grammars are written in grammatical formalisms that resemble very-high-level programming languages, and are thus very similar to computer programs. Yet grammar engineering is still in its infancy: Few grammar development environments support sophisticated modularized grammar development, in the form of distribution of the grammar development effort, combination of sub-grammars, separate compilation and automatic linkage, information encapsulation, and so forth. This work provides preliminary foundations for modular construction of (typed) unification grammars for natural languages. Much of the information in such formalisms is encoded by the type signature, and we subsequently address the problem through the distribution of the signature among the different modules. We define signature modules and provide operators of module combination. Modules may specify only partial information about the components of the signature and may communicate through parameters, similarly to function calls in programming languages. Our definitions are inspired by methods and techniques of programming language theory and software engineering and are motivated by the actual needs of grammar developers, obtained through a careful examination of existing grammars. We show that our definitions meet these needs by conforming to a detailed set of desiderata. We demonstrate the utility of our definitions by providing a modular design of the HPSG grammar of Pollard and Sag.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Partially Specified Signatures: A Vehicle for Grammar Modularity

This work provides the essential foundations for modular construction of (typed) unification grammars for natural languages. Much of the information in such grammars is encoded in the signature, and hence the key is facilitating a modularized development of type signatures. We introduce a definition of signature modules and show how two modules combine. Our definitions are motivated by the actu...

متن کامل

A baseline method for compiling typed unification grammars into context free language models

This paper presents a minimal enumerative approach to the problem of compiling typed unification grammars into CFG language models, a prototype implementation and results of experiments in which it was used to compile some non-trivial unification grammars. We argue that enumerative methods are considerably more useful than has been previously believed. Also, the simplicity of enumerative method...

متن کامل

An Open-Source Environment for Compiling Typed Unification Grammars into Speech Recognisers

We present REGULUS, an Open Source environment which compiles typed unification grammars into context free grammar language models compatible with the Nuance Toolkit. The environment includes a large general unification grammar of English and corpus-based tools for creating efficient domainspecific recognisers from it. We will demo applications built using the system, including a speech transla...

متن کامل

Flexible Structural Analysis of Near-Meet-Semilattices for Typed Unification-Based Grammar Design

We present a new method for directly working with typed unification grammars in which type unification is not well-defined. This is often the case, as large-scale HPSG grammars now usually have type systems for which many pairs do not have least upper bounds. Our method yields a unification algorithm that compiles quickly and yet is nearly as fast during parsing as one that requires least upper...

متن کامل

A Framework for Inductive Learning of Typed-Unification Grammars

LIGHT, the parsing system for typed-unification grammars [3], was recently extended so to allow the automate learning of attribute/feature paths values. Motivated by the logic design of these grammars [2], the learning strategy we adopted is inspired by Inductive Logic Programming [4]; we proceed by searching through hypothesis spaces generated by logic transformations of the input grammar. Two...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Computational Linguistics

دوره 37  شماره 

صفحات  -

تاریخ انتشار 2011